AITopics | continuous control

07956d40074d6523bad11112b3225c6e-Supplemental-Conference.pdf

Neural Information Processing Systems, Apr-24-2026, 11:10:06 GMT

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Double Gumbel Q-Learning

Neural Information Processing Systems, Apr-24-2026, 11:10:02 GMT

We show that Deep Neural Networks introduce two heteroscedastic Gumbel noise sources into Q-Learning. To account for these noise sources, we propose Double Gumbel Q-Learning, a Deep Q-Learning algorithm applicable for both discrete and continuous control. In discrete control, we derive a closed-form expression for the loss function of our algorithm. In continuous control, this loss function is intractable and we therefore derive an approximation with a hyperparameter whose value regulates pessimism in Q-Learning. We present a default value for our pessimism hyperparameter that enables DoubleGum to outperform DDPG, TD3, SAC, XQL, quantile regression, and Mixture-of-Gaussian Critics in aggregate over 33 tasks from DeepMind Control, MuJoCo, MetaWorld, and Box2D and show that tuning this hyperparameter may further improve sample efficiency.
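As background for why Gumbel noise pairs naturally with the max in Q-Learning, the sketch below checks a standard property numerically: if each action value is perturbed by independent Gumbel noise with scale beta, the expected maximum equals beta times the log-sum-exp of the values divided by beta, plus beta times the Euler-Mascheroni constant. This is not the DoubleGum loss from the abstract. The action values, the scale `beta`, and all function names are hypothetical choices made for illustration, and the noise here is homoscedastic for simplicity.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329  # mean of a standard Gumbel(0, 1) variable


def logsumexp(xs):
    """Numerically stable log(sum(exp(x))) for a list of floats."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def sample_gumbel(rng, beta):
    """Draw from Gumbel(location=0, scale=beta) via the inverse CDF."""
    u = rng.random()
    while u == 0.0:  # avoid log(0); random() never returns 1.0
        u = rng.random()
    return -beta * math.log(-math.log(u))


def expected_max_monte_carlo(q_values, beta, n_samples, seed=0):
    """Estimate E[max_a (Q(a) + g_a)] with i.i.d. Gumbel noise g_a."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        total += max(q + sample_gumbel(rng, beta) for q in q_values)
    return total / n_samples


def expected_max_closed_form(q_values, beta):
    """Closed form: beta * logsumexp(Q / beta) + beta * gamma."""
    return beta * logsumexp([q / beta for q in q_values]) + beta * EULER_GAMMA


# Hypothetical action values for a single state (assumption, not from the paper).
q_values = [1.0, 0.5, -0.3]
beta = 0.5

mc_estimate = expected_max_monte_carlo(q_values, beta, n_samples=100_000)
closed_form = expected_max_closed_form(q_values, beta)
print(f"Monte Carlo: {mc_estimate:.3f}  closed form: {closed_form:.3f}")
```

The two printed numbers should agree up to Monte Carlo error. The abstract's setting differs in that the noise is heteroscedastic and arises from two sources, so this check covers only the simplest case.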

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country:

North America > Canada (0.28)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)

Genre: Research Report > New Finding (0.45)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

a439259e78294c38d157a51a2c40486b-Paper-Conference.pdf

Neural Information Processing Systems, Feb-17-2026, 04:22:13 GMT

machine learning, natural language, reinforcement learning, (17 more...)

Neural Information Processing Systems

Country: North America > United States > South Carolina (0.04)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.95)
(3 more...)

f337d999d9ad116a7b4f3d409fcc6480-Paper.pdf

Neural Information Processing Systems, Feb-11-2026, 21:47:36 GMT

aac, action repetition, repetition, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
North America > United States > California > Santa Clara County > Cupertino (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Workflow (0.46)
Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Robots (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

fb2e203234df6dee15934e448ee88971-Paper.pdf

Neural Information Processing Systems, Feb-11-2026, 05:21:03 GMT

algorithm, initialization, robust stability condition, (12 more...)

Neural Information Processing Systems

Country:

North America > United States > Illinois (0.05)
Asia > Middle East > Jordan (0.04)
North America > United States > New Jersey (0.04)
(4 more...)

Industry: Government > Military (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)

acab0116c354964a558e65bdd07ff047-Paper.pdf

Neural Information Processing Systems, Feb-9-2026, 19:31:24 GMT

agent, algorithm, ktm-drl, (15 more...)

Neural Information Processing Systems

Country:

North America > Canada (0.04)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

a70145bf8b173e4496b554ce57969e24-Paper.pdf

Neural Information Processing Systems, Feb-9-2026, 17:05:50 GMT

apple, apple 1, decoder, (15 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

884d247c6f65a96a7da4d1105d584ddd-Paper.pdf

Neural Information Processing Systems, Feb-9-2026, 06:33:31 GMT

DDPG [24] extends Q-learning to continuous control based on the Deterministic Policy Gradient [31] algorithm, which learns a deterministic policy π(s; φ), parameterized by φ, that maximizes the Q-function and thereby approximates the max operator.
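The idea that a deterministic policy can stand in for the max operator can be sketched on a toy problem. The example below is a minimal illustration under stated assumptions, not the DDPG implementation: the critic is a fixed, known quadratic Q(s, a) = -(a - 2s)^2 rather than a learned network, the policy is the linear map π(s; φ) = φ·s, and every function name is hypothetical. The policy parameter is updated by gradient ascent on Q(s, π(s; φ)) using the chain rule, which is the deterministic policy gradient in its simplest form.

```python
def q_value(s, a):
    """Toy critic (assumed known and fixed): maximized at a = 2 * s."""
    return -(a - 2.0 * s) ** 2


def dq_da(s, a):
    """Gradient of the toy critic with respect to the action."""
    return -2.0 * (a - 2.0 * s)


def policy(s, phi):
    """Deterministic linear policy pi(s; phi) = phi * s."""
    return phi * s


def train_policy(states, phi=0.0, lr=0.05, steps=200):
    """Gradient ascent on the mean of Q(s, pi(s; phi)) over the given states."""
    for _ in range(steps):
        grad = 0.0
        for s in states:
            a = policy(s, phi)
            # Chain rule: dQ/dphi = dQ/da * dpi/dphi, with dpi/dphi = s.
            grad += dq_da(s, a) * s
        phi += lr * grad / len(states)
    return phi


states = [-1.0, -0.5, 0.5, 1.0]
phi = train_policy(states)

# Compare the policy's value with a brute-force max over a grid of actions.
s = 0.5
grid = [i / 100.0 for i in range(-300, 301)]
brute_force_max = max(q_value(s, a) for a in grid)
policy_value = q_value(s, policy(s, phi))
print(f"phi = {phi:.4f}, Q at policy action = {policy_value:.6f}, "
      f"grid max = {brute_force_max:.6f}")
```

In this toy setting the learned φ approaches 2, so Q(s, π(s; φ)) approaches max over a of Q(s, a) without enumerating actions. In DDPG itself both the critic and the policy are neural networks trained jointly, which this sketch leaves out.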

artificial intelligence, machine learning, reinforcement learning, (20 more...)

Neural Information Processing Systems

Country: